The Bitwise Bloom Filter
Abstract
We present the Bitwise Bloom Filter, a data structure for maintaining counts for a large number of items. The Bitwise filter extends the Bloom filter, a space-efficient data structure that stores a large set by discarding the identity of the items it holds while still being able to determine, with high probability, whether a given item is in the set. We show how this idea can be extended to maintaining counts of items by keeping a separate Bloom filter for every position in the bit representations of all the counts. We give both a theoretical analysis of the accuracy of the Bitwise filter and validation via experiments on real network data.
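The per-bit idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names `BloomFilter` and `BitwiseCounts`, the default sizes, and the salted SHA-256 hashing are all assumptions made for illustration. The sketch only covers storing a known count once per item and reading it back; the abstract does not say how counts are updated, so that is left out.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over a fixed-size bit array (illustrative only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Assumption: derive each hash by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


class BitwiseCounts:
    """One Bloom filter per bit position of the stored counts (hypothetical name)."""

    def __init__(self, count_bits=16, num_bits=4096, num_hashes=4):
        self.filters = [BloomFilter(num_bits, num_hashes) for _ in range(count_bits)]

    def store(self, item, count):
        # Store a count once per item: storing twice would OR the bit patterns.
        if not 0 <= count < (1 << len(self.filters)):
            raise ValueError("count does not fit in the configured number of bits")
        for j, bloom in enumerate(self.filters):
            if (count >> j) & 1:
                bloom.add(item)  # item belongs to filter j iff bit j of count is 1

    def estimate(self, item):
        # Reassemble the count from the per-bit membership answers.
        return sum(1 << j for j, bloom in enumerate(self.filters) if item in bloom)


if __name__ == "__main__":
    counts = BitwiseCounts(count_bits=8)
    counts.store("10.0.0.1", 5)
    print(counts.estimate("10.0.0.1"))
```

Because a Bloom filter has no false negatives, every 1 bit of a stored count is always recovered; a false positive can only turn a 0 bit into a 1, so in this sketch the estimate never falls below the stored value.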
Similar resources
md5bloom: Forensic filesystem hashing revisited
Hashing is a fundamental tool in digital forensic analysis used both to ensure data integrity and to efficiently identify known data objects. However, despite many years of practice, its basic use has advanced little. Our objective is to leverage advanced hashing techniques in order to improve the efficiency and scalability of digital forensic analysis. Specifically, we explore the use of Bloom...
Private record linkage with Bloom filters
In many record linkage applications, identifiers have to be encrypted to preserve privacy. Therefore, a method for approximate string comparison in private record linkage is needed. We describe a new method of approximate string comparison in private record linkage. The main idea is to store q-grams sets derived from identifier values in Bloom filters and compare them bitwise across databases. ...
A Cuckoo Filter Modification Inspired by Bloom Filter
Probabilistic data structures are popular in membership queries, network applications, and so on. Bloom Filter and Cuckoo Filter are two popular space-efficient models that are incorporated in the set membership checking part of many important protocols. They are compact representations of data that use hash functions to randomize a set of items. Being able to store more elements while keeping a reaso...
Reducing False Positives of a Bloom Filter using Cross-Checking Bloom Filters
A Bloom filter is a compact data structure that supports membership queries on a set, allowing false positives. The simplicity and the excellent performance of a Bloom filter make it a standard data structure of great use in many network applications. In reducing the false positive rate of a Bloom filter, it is well known that the size of a Bloom filter and accordingly the number of hash indice...
Don't Thrash: How to Cache Your Hash on Flash
This paper presents new alternatives to the well-known Bloom filter data structure. The Bloom filter, a compact data structure supporting set insertion and membership queries, has found wide application in databases, storage systems, and networks. Because the Bloom filter performs frequent random reads and writes, it is used almost exclusively in RAM, limiting the size of the sets it can repres...